Abstract

Using a tensorial approach, we show how to construct a one-one correspondence between 
pattern probabilities and edge parameters for any group-based model. This is a generalisa- 
tion of the "Hadamard conjugation" and is equivalent to standard results that use Fourier 
analysis. In our derivation we focus on the connections to group representation theory and 
emphasize that the inversion is possible because, under their usual definition, group-based 
models are defined for abelian groups only. We also argue that our approach is elementary 
in the sense that it can be understood as simple matrix multiplication where matrices are 
rectangular and indexed by ordered-partitions of varying sizes.

1 Introduction

In a series of papers from 1989 and the early 90s, Hendy and colleagues introduced the Hadamard
conjugation as a novel tool for phylogenetic analyses (Hendy & Penny, 1989; Hendy, 1989;
Hendy & Penny, 1993). They found an invertible relationship between a phylogenetic tree, as
characterized by an edge length spectrum, and the probability of each site pattern (referred to
as the sequence spectrum). Originally introduced only for the 2-state symmetric model, the
Hadamard conjugation was later extended to the K3ST model (Hendy et al., 1994) and further
to any of the so-called "group-based" models (Székely et al., 1993b). Hadamard conjugation
has been used both as a tool for simulation (Hendy & Charleston, 1993) and to look at
statistical properties of methods, exploring the inconsistency of parsimony under a molecular clock
(Hendy & Penny, 1993; Holland et al., 2003). For these sorts of applications, following the
notation in Felsenstein (2004), we can use the Hadamard transform $H$ to start with an edge length
spectrum $\gamma$ and calculate the sequence spectrum $s = H^{-1}\exp(H\gamma)$. The beauty of Hadamard
conjugations is that one can also begin with an observed sequence spectrum $s$ and perform the
inverse of the conjugation to empirically obtain an edge length spectrum $\gamma = H^{-1}\log(Hs)$.
Although it is not expected that the $\gamma$ spectrum will precisely match a tree, Hendy (1991) proposed
using an optimisation criterion to map from $\gamma$ to the "closest tree".

Several authors have commented that it is potentially a useful feature of Hadamard conjugation
that data is not forced onto a fixed tree. The conflicting information can be retained and
interpreted in the form of a "lentoplot" (Lento et al., 1995) or a splits-graph (Huber et al., 2001),
with both of these methods implemented in Spectronet (Huber et al., 2002). Schliep (2009) gives
some more statistical justification for such an approach by making a link to modern statistical
techniques such as the Lasso and ridge regression.

von Haeseler & Churchill (1993) seems to be the first paper that explicitly suggests using
Hadamard conjugation to provide a likelihood framework for networks. The chief idea is
that one can start with an edge length spectrum that encodes a set of incompatible splits, use
the Hadamard transformation to get site probabilities, and use these to determine a likelihood.
This idea was further explored by Bryant (2005), and Bryant (2009) followed this through by
defining the "n-taxon process" for group-based models. It should be noted that likelihoods calculated
via Hadamard are not equivalent to likelihoods calculated by taking a mixture of trees. Indeed,
Matsen & Steel (2007) and Matsen et al. (2008) used Hadamard methods in combination with
phylogenetic invariants to show that mixtures of trees with the same topology can exactly mimic
another tree under the 2-state model. Considering biological applications, thinking in terms
of mixtures of trees or partitions where the data can be thought of as arising on a set of trees
(Griffiths & Marjoram, 1996; Griffiths & Marjoram, 1997; Jin et al., 2006) seems more reasonable
than the Hadamard conjugation. Strimmer & Moulton (2000) suggested using split networks as
a springboard to likelihood-based analyses on DAGs, but later identified several problems with
the approach (Strimmer et al., 2001); most notably, in split-networks internal nodes do not have
a biological interpretation as an ancestor.

In Sumner et al. (2012b), we gave some additional insight into the interpretation of applying
the Hadamard conjugation in a network setting. We showed that the permutation group structure
inherent to the Hadamard transformation, as for any group-based model, restricts the resulting
process from being capable of reproducing truly convergent processes. This is a serious limitation,
as one of the biological motivations for explicit network models is the ability to model convergent
processes. We also presented an alternative algebraic formalism for the general Markov model,
analogous to the n-taxon process, but capable of reproducing convergent processes. From the
point of view of group representation theory, the inversion of group-based models relies on the
fact that the irreducible representations of an abelian group are one-dimensional, and the model
structure essentially reduces to group characters, hence the standard presentation of a Fourier
inversion. In this article, we make this connection concrete. For the general Markov model, it
is then immediately apparent that an analogous inversion is not possible because the underlying
irreducible representations are not one-dimensional. In fact, to obtain one-dimensional
representations for the general Markov model, it is necessary to apply higher-degree polynomial maps
(beyond the degree 1, linear case), and define "Markov invariants" (Sumner et al., 2008). These
invariants present one-dimensional representations but at the cost of the higher degree: degree 5
in the case of the general Markov model on four states on quartet trees (Sumner & Jarvis, 2009;
Holland et al., 2012). This connection between Hadamard transformation and Markov invariants
is an interesting one, but we do not discuss it further here.

In this paper we approach the inversion of group-based phylogenetic models by taking a
representation-theoretic perspective and working explicitly with tensor indices. Our approach
rests heavily on the formalism of "phylogenetic tensors", as presented in Bashford et al. (2004),
for the binary-symmetric and K3ST model, and Sumner et al. (2008, 2012b), for the general
Markov model.



2 Background 



In this paper we consider the continuous-time formulation of Markov processes, and show how to 
implement the inversion of a group-based phylogenetic model based on any abelian group. We 
note that such an inversion requires a map from tensor product space (where elements are indexed
by ordered-$n$-partitions) to phylogenetic splits (where elements are indexed by bipartitions). We
achieve this by finding canonical maps from bipartitions to ordered-$n$-partitions.

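For a group $G$ with order $|G| = d$, we write $G = \{\sigma_1, \sigma_2, \ldots, \sigma_d\}$, and, when necessary,
write $e \in G$ to specify the identity element of $G$. Consider the vector space
\[ \mathbb{C}^d \cong \langle G\rangle_{\mathbb{C}} := \langle\sigma_1,\sigma_2,\ldots,\sigma_d\rangle_{\mathbb{C}} = \{v = v_1\sigma_1 + v_2\sigma_2 + \ldots + v_d\sigma_d : v_i \in \mathbb{C}\}, \]
with scalar multiplication and vector addition defined via
\[ v + \lambda v' = (v_1\sigma_1 + v_2\sigma_2 + \ldots + v_d\sigma_d) + \lambda(v'_1\sigma_1 + v'_2\sigma_2 + \ldots + v'_d\sigma_d) = (v_1 + \lambda v'_1)\sigma_1 + (v_2 + \lambda v'_2)\sigma_2 + \ldots + (v_d + \lambda v'_d)\sigma_d, \]
for all $v, v' \in \langle G\rangle_{\mathbb{C}}$ and $\lambda \in \mathbb{C}$. The regular representation, $\rho_{reg} : G \to GL(d,\mathbb{C})$, is then defined
by setting the group action
\[ \sigma : v \mapsto \sigma v = v_1(\sigma\sigma_1) + v_2(\sigma\sigma_2) + \ldots + v_d(\sigma\sigma_d), \]
for all $v \in \langle G\rangle_{\mathbb{C}}$ and $\sigma \in G$. If we fix $\{\sigma_1, \sigma_2, \ldots, \sigma_d\}$ as an ordered basis for $\langle G\rangle_{\mathbb{C}}$, it is then
clear (via Cayley's theorem) that each group element $\sigma$ gets mapped to a permutation matrix
$K_\sigma := \rho_{reg}(\sigma)$, with $K_\sigma\sigma_i = \sum_j [K_\sigma]_i^j\sigma_j := \sigma\sigma_i$. Thus $K_\sigma$ has matrix elements
\[ [K_\sigma]_i^j = \begin{cases} 1, & \text{if } \sigma_j = \sigma\sigma_i, \\ 0, & \text{otherwise.} \end{cases} \]

Consider the unit column vectors
\[ \xi_1 = (1,0,0,\ldots,0)^T, \quad \xi_2 = (0,1,0,\ldots,0)^T, \quad \ldots, \quad \xi_d = (0,0,\ldots,0,1)^T; \]
and identify $\sigma_i \equiv \xi_i$, so that the group action becomes $\sigma : \xi_i \mapsto K_\sigma\xi_i = \xi_j$ where $\sigma_j = \sigma\sigma_i$. Thus
the matrix elements $[K_\sigma]_i^j$ have $i$ as the column label and $j$ as the row label.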
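
As a concrete illustration of the regular representation, the following minimal sketch builds the permutation matrices for a cyclic group and checks the homomorphism property. It assumes the group is $\mathbb{Z}_r$ written additively with elements 0, ..., r-1; the function names `regular_rep` and `matmul` are hypothetical helpers introduced only for this example.

```python
def regular_rep(r):
    """Permutation matrices K[s] representing s in Z_r = {0, ..., r-1}.

    K[s][j][i] (row j, column i) is 1 exactly when j = s + i (mod r),
    mirroring the rule [K_sigma]_i^j = 1 iff sigma_j = sigma sigma_i.
    """
    return [[[1 if j == (s + i) % r else 0 for i in range(r)]
             for j in range(r)]
            for s in range(r)]


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


K = regular_rep(3)
# Homomorphism check: K[s] K[t] should equal K[s + t mod 3].
is_homomorphism = all(matmul(K[s], K[t]) == K[(s + t) % 3]
                      for s in range(3) for t in range(3))
print(K[1])             # matrix representing the generator of Z_3
print(is_homomorphism)
```

The column index of each matrix is the state being acted on and the row index is the resulting state, in line with the index convention stated above.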



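With the regular representation in hand, it can then be shown (see Sumner et al. (2012a))
that the group-based model defined by $G$ has rate matrices of the form
\[ Q = -\lambda 1 + \sum_{e\neq\sigma\in G}\alpha_\sigma K_\sigma, \]
where each $0 \leq \alpha_\sigma \in \mathbb{R}$ and $\lambda = \sum_{e\neq\sigma\in G}\alpha_\sigma$.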

The regular representation is one example of the general concept of a representation of $G$ on
a vector space $V$, defined as a homomorphism $\rho : G \to GL(V)$ satisfying $\rho(g_1g_2) = \rho(g_1)\rho(g_2)$ for
all $g_1, g_2 \in G$. A representation is said to be reducible if there exists a proper, non-zero subspace $U \subset V$
satisfying $\rho(g)U \subseteq U$ for all $g \in G$, i.e. the set of matrices $\rho(G)$ send vectors in $U$ back to $U$. In this case,
$U$ is called an invariant subspace. The representation $\rho$ is then called irreducible if $V$ does not
contain any proper, non-zero invariant subspaces.



The reader should note that the usual construction of a "group-based" model (Semple & Steel,
2003) stipulates that $G$ be abelian. Although the construction just given using the regular
representation allows for non-abelian $G$, we will nonetheless only consider the abelian case in
this paper, because, as discussed in the introduction, it is only in the abelian case that a (linear)
inversion of phylogenetic models is possible. In this case the irreducible representations of $G$ are
all one-dimensional (Sagan, 2001), and hence reduce to the group characters, as is exploited in
the previous approaches using Fourier analysis (Székely et al., 1993b).

Figure 1: Markov evolution on a single taxon followed by a branching event (illustrated on the left)
is equivalent to a branching event on a single taxon followed by correlated Markov evolution of
two taxa (illustrated on the right). Mathematically, this equivalence can be implemented by
exploiting the equality given in (2).



2.1 Phylogenetic tensors 

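As is shown in Sumner & Jarvis (2005) and in more detail in Sumner et al. (2012b), phylogenetic
distributions on the state space $[d] := \{1,2,\ldots,d\}$ can be represented as tensors in the $n$-fold
tensor product space $\otimes^n\mathbb{C}^d := \mathbb{C}^d\otimes\mathbb{C}^d\otimes\ldots\otimes\mathbb{C}^d$. If we choose $\{\xi_1,\xi_2,\ldots,\xi_d\}$ as an
ordered basis for $\mathbb{C}^d$, and ordered basis $\{\xi_{i_1}\otimes\xi_{i_2}\otimes\ldots\otimes\xi_{i_n}\}_{i_1,i_2,\ldots,i_n\in[d]}$ for the tensor product
space, a "phylogenetic tensor"
\[ P = \sum_{i_1,i_2,\ldots,i_n\in[d]} p_{i_1i_2\ldots i_n}\,\xi_{i_1}\otimes\xi_{i_2}\otimes\ldots\otimes\xi_{i_n} \in \otimes^n\mathbb{C}^d \]
has the interpretation that the components $p_{i_1i_2\ldots i_n}$ represent the probability that the $n$ taxa take on
the states $i_1,\ldots,i_n$ respectively.

Phylogenetic branching events can be generated by the linear operator $\delta : \mathbb{C}^d \to \mathbb{C}^d\otimes\mathbb{C}^d$
defined on the chosen basis via
\[ \delta(\xi_i) = \xi_i\otimes\xi_i. \]

The remarkable fact for group-based models, central to the present article, is that the rate
matrices "intertwine" particularly simply with the branching operator:
\[ \delta(K_\sigma\xi_i) = \delta(\xi_j) = \xi_j\otimes\xi_j = K_\sigma\xi_i\otimes K_\sigma\xi_i = K_\sigma\otimes K_\sigma\cdot\delta(\xi_i), \]
where $\sigma_j = \sigma\sigma_i$. Thus we have
\[ \delta\cdot Q = \left(-\lambda 1\otimes 1 + \sum_{e\neq\sigma\in G}\alpha_\sigma K_\sigma\otimes K_\sigma\right)\cdot\delta, \]
which in turn implies (via the linearity of $\delta$) that
\[ \delta\cdot e^{Qt} = e^{-\lambda t}\exp\left(t\sum_{e\neq\sigma\in G}\alpha_\sigma K_\sigma\otimes K_\sigma\right)\cdot\delta. \qquad (2) \]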
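
The intertwining of permutation matrices with the branching operator can be checked numerically. The sketch below is an illustration only: it assumes $G = \mathbb{Z}_3$, represents $\delta$ as a $d^2 \times d$ matrix, and uses hypothetical helper names (`kron`, `matmul`, `cyclic_shift`, `branching`) that do not come from the text.

```python
def kron(A, B):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def matmul(A, B):
    """Plain-Python matrix product (rectangular shapes allowed)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def cyclic_shift(r, s):
    """Permutation matrix of s in Z_r: column i has its 1 in row s + i (mod r)."""
    return [[1 if j == (s + i) % r else 0 for i in range(r)]
            for j in range(r)]


def branching(d):
    """Matrix of delta, sending xi_k to xi_k (x) xi_k; shape d^2 x d."""
    return [[1 if p == k * d + k else 0 for k in range(d)]
            for p in range(d * d)]


d = 3
delta = branching(d)
# Check delta K = (K (x) K) delta for every element of Z_3.
intertwines = all(
    matmul(delta, cyclic_shift(d, s))
    == matmul(kron(cyclic_shift(d, s), cyclic_shift(d, s)), delta)
    for s in range(d))
print(intertwines)
```

The check relies only on each $K_\sigma$ sending basis vectors to basis vectors, which is the step used in the chain of equalities above.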



This relation shows that mathematically, and hence conceptually, "Markov evolution on a single
taxon followed by a branching event" can be replaced with "Branching event on a single taxon followed
by (correlated) Markov evolution of two taxa." This equivalence is illustrated in Figure 1.

In Sumner et al. (2012b) we showed how to generalise this intertwining action to the case of
the general Markov model. Interestingly, the general intertwining has quite different structure
from what occurs in group-based models, and the simplicity of (2) is actually quite misleading
for the general Markov model. We refer the reader to Sumner et al. (2012b) for more discussion
on this point.

Figure 2: A six taxa tree rooted at taxon 6 with edges labelled by subsets of {1, 2, 3, 4, 5}.



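Returning to the case of group-based models, for each subset $A \subseteq [n]$, we define a linear map
$K_\sigma^{(A)}$ as the tensor product
\[ K_\sigma^{(A)} := K_\sigma^{a_1}\otimes K_\sigma^{a_2}\otimes\ldots\otimes K_\sigma^{a_n}, \]
where $a_i = 1$ if $i \in A$ and $a_i = 0$ otherwise. For example, if $n = 5$, we have
\[ K_\sigma^{(\{1,2,4\})} = K_\sigma\otimes K_\sigma\otimes 1\otimes K_\sigma\otimes 1. \]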

To develop a phylogenetic tensor on a tree, we root the phylogenetic tree at taxon $n$, and label
edges by subsets $\emptyset \neq e \subseteq [n-1]$, where $i \in e$ if the path from taxon $n$ to taxon $i$ crosses the edge
labelled by $e$. A six-taxon tree with this labelling is presented in Figure 2. To each edge labelled
by $\emptyset \neq e \subseteq [n-1]$, we assign the rate matrix
\[ Q_e = -\lambda_e 1 + \sum_{e\neq\sigma\in G}\alpha_e^\sigma K_\sigma, \]
where each $\alpha_e^\sigma$ is the rate of substitution for all states $\sigma_1$ to $\sigma_2$ satisfying $\sigma = \sigma_2\sigma_1^{-1}$, and
$\lambda_e = \sum_{e\neq\sigma\in G}\alpha_e^\sigma$ (in the summation ranges, $e$ denotes the identity of $G$). Each edge is then assigned
substitution matrix $M_e = e^{Q_e}$, so that the time parameter for each edge is absorbed into the definition of $Q_e$.

Now iterating (2) multiple times, Bashford et al. (2004); Sumner et al. (2012b) show that any
phylogenetic tensor can be written as
\[ P = e^{-\lambda}\exp\left(\sum_{\emptyset\neq e\subseteq[n-1]}\ \sum_{e\neq\sigma\in G}\alpha_e^\sigma K_\sigma^{(e)}\right)\cdot\delta^{n-1}\pi, \qquad (3) \]
where $\lambda = \sum_{\emptyset\neq e\subseteq[n-1]}\lambda_e = \sum_{\emptyset\neq e\subseteq[n-1],\ e\neq\sigma\in G}\alpha_e^\sigma$, and $\delta^{n-1}\pi$ is the $d\times d\times\ldots\times d$ tensor that
represents the "zero edge-length star tree" distribution on $n$ taxa. It is this form of phylogenetic
tensors that will do a lot of the heavy lifting in the discussion that follows. The reader should
note that under this representation, there is no need for the edge parameters $\{\alpha_e^\sigma : \emptyset\neq e\subseteq
[n-1], \sigma\in G\}$ to be chosen to be compatible with a particular tree, hence the possibilities for
generalising to non-tree-like or network models, as discussed in the introduction.

The stationary distribution for group-based models is uniform (because the substitution matrices are
doubly stochastic). In this paper we always assume a stationary distribution, so that:
\[ \pi = \tfrac{1}{d}(1,1,\ldots,1)^T, \]
and $\delta^{n-1}\pi$ has tensor components
\[ [\delta^{n-1}\pi]_{i_1i_2\ldots i_n} = \begin{cases} \tfrac{1}{d}, & \text{if } i_1 = i_2 = \ldots = i_n, \\ 0, & \text{otherwise.} \end{cases} \]

This concludes our discussion of the tensor presentation of phylogenetic probability distribu- 
tions under group-based models. We now review the standard Fourier analysis of these models, 
and make the connections to representation theory explicit. 



2.2 Connection to Fourier analysis 

In this subsection we briefly point out the connection between the standard Fourier analysis and 
representation theory. Understanding this connection - or indeed the underlying representation 
theory - is not required to understand our general method, so the uninterested reader may wish 
to skip forward directly to the next section. 

Given an (abelian) group-based model, the crucial aspects of the Fourier transform that are
exploited in the phylogenetic context (e.g. Chor et al. (2000); Evans & Speed (1993); Hendy & Penny
(1989); Hendy (1989); Hendy & Penny (1993); Hendy et al. (1994); Hendy & Snir (2008); Sturmfels & Sullivant
(2005); Székely et al. (1993a,b)) are as follows.



Result 1. Let $f_1, f_2$ be functions from a finite abelian group $G$ to $\mathbb{C}$ and $1$ the constant function.

1. The group $G$ and the dual group $\hat{G} := \mathrm{Hom}(G,\mathbb{C}^\times)$ are isomorphic as abstract groups.

2. Fourier transform turns convolution into multiplication, i.e., $\widehat{f_1 * f_2} = \hat{f}_1\cdot\hat{f}_2$, and

3. If $\chi \in \hat{G}$ is irreducible, then $\hat{1}(\chi) = |G|$ if $\chi = 1$ (the unit in $\hat{G}$) and $\hat{1}(\chi) = 0$ otherwise.

We recall the aspects of the representation theory that are needed in our discussion and
express the above in terms of them. For the reader who is unfamiliar with the general theory, we
recommend the excellent elementary text Sagan (2001).

Result 2. Given a representation $\rho : G \to GL(V)$ and an irreducible character $\chi : G \to \mathbb{C}$, the
projectors onto the irreducible representations of $G$ are given by
\[ \Theta_\chi := \frac{1}{|G|}\sum_{g\in G}\chi(g)\rho(g). \]

Result 3. The regular representation contains each irreducible representation $\rho_\chi$ exactly $\dim(\rho_\chi) =
\chi(e)$ times.

Result 4. The irreducible representations of an abelian group are one-dimensional. 

Result 5. The character table of an abelian group G diagonalizes the regular representation. 

Result 6. Any finite abelian group $G$ is isomorphic to a direct product of cyclic
groups of prime-power order, i.e. $G \cong \mathbb{Z}_{r_1}\times\mathbb{Z}_{r_2}\times\ldots\times\mathbb{Z}_{r_q}$ where each $r_i = q_i^{n_i}$, where $q_i$ is prime and
$n_i$ is a positive integer.

Result 7. The irreducible representations of $\mathbb{Z}_r$ are given by $\rho_i(\sigma) = \omega^i$ with $i = 0,1,2,\ldots,r-1$
and $\omega^r = 1$.

Result 8. The irreducible representations of $\mathbb{Z}_{r_1}\times\mathbb{Z}_{r_2}\times\ldots\times\mathbb{Z}_{r_q}$ are given by
\[ \rho_{i_1i_2\ldots i_q} = \rho^{(1)}_{i_1}\otimes\rho^{(2)}_{i_2}\otimes\ldots\otimes\rho^{(q)}_{i_q}, \]
where $\rho^{(i)}_k(\sigma_i) = (\omega_i)^k$ and $(\sigma_i)^{r_i} = e_i$ with $e_i$ the identity in $\mathbb{Z}_{r_i}$ and $(\omega_i)^{r_i} = 1$.

Proof. The representation $\rho := \rho_1\otimes\rho_2$ of $G = G_1\times G_2$, constructed from irreducible
representations $\rho_1, \rho_2$ of $G_1, G_2$ respectively, is irreducible. The result follows from induction. □

Result 9. The Fourier analytic results of Result 1 have representation theory counterparts:

1. The regular representation is faithful, i.e. injective.

2. The columns of the character table project onto the irreducible subspaces. Therefore, the
character table of an abelian group $G$ diagonalizes the regular representation.

3. For an abelian group, the identity column of the character table obviously sums to $|G|$ and
the other columns are orthogonal, thus the other columns sum to 0.

In what follows, we discuss the inversion of abelian group-based models. We present the
simplest case with $G = \mathbb{Z}_2$ in §3, the $G = \mathbb{Z}_3$ case in §4, the $G = \mathbb{Z}_2\times\mathbb{Z}_2$ case in §5, the general
$G = \mathbb{Z}_r$ case in §6 and finally we discuss the case of any abelian group in §7.



3 The binary-symmetric case

We begin with the inversion of the so-called "binary-symmetric" model. Consider $\mathbb{C}^2$ with
standard basis
\[ \xi_0 = (1,0)^T, \qquad \xi_1 = (0,1)^T. \]
As a group-based model, the binary-symmetric model arises by taking the group
\[ G := \mathbb{Z}_2 = \{0,1\}_{+(\mathrm{mod}\ 2)} \cong \langle\sigma\,|\,\sigma^2 = e\rangle, \]
with a generic rate matrix given by
\[ Q = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} = -1 + K, \]
where $K = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ is the permutation matrix representing $\sigma$ in the standard basis.
Now
\[ \rho_{reg} : \mathbb{Z}_2 \to M_2(\mathbb{C}), \qquad \sigma \mapsto K \]
is the regular representation of $\mathbb{Z}_2$, and the character table of $\mathbb{Z}_2$ given in Table 1 is easily
recognised to be the Hadamard matrix
\[ h = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \]
As $\mathbb{Z}_2$ is an abelian group, the irreducible representations are one-dimensional (Result 4).
Recalling Result 2, the corresponding projection operators can be read off from the columns of the
character table. That is, the operators
\[ \Theta_{id} := \tfrac{1}{2}(e + \sigma), \qquad \Theta_{sgn} := \tfrac{1}{2}(e - \sigma), \]
project $\rho_{reg} = id\oplus sgn$ onto the id and sgn representations of $\mathbb{Z}_2$, respectively.

Table 1: The character table of Z2.

          id    sgn
  [e]      1      1
  [σ]      1     -1



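This observation prompts us to work in the alternative basis:
\[ f_0 := 2\Theta_{id}\cdot\xi_0 = 2\Theta_{id}\cdot\xi_1 = h\xi_0 = \xi_0 + \xi_1, \]
\[ f_1 := 2\Theta_{sgn}\cdot\xi_0 = -2\Theta_{sgn}\cdot\xi_1 = h\xi_1 = \xi_0 - \xi_1. \]
In this basis the permutation matrix is diagonal:
\[ \hat{K} := hKh^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \hat{Q} := -1 + \hat{K} = \begin{pmatrix} 0 & 0 \\ 0 & -2 \end{pmatrix}. \]
The representation-theoretic perspective on $\hat{K}$ is to observe that $id(\sigma) = 1$ and $sgn(\sigma) = -1$.

Referring to (3), we know that we can write a generic phylogenetic tensor as
\[ P = e^{-\lambda}\exp\left(\sum_{\emptyset\neq e\subseteq[n-1]}\alpha_e K^{(e)}\right)\cdot\delta^{n-1}\pi, \]
where $\lambda = \sum_{\emptyset\neq e\subseteq[n-1]}\alpha_e$.

We index matrix and tensor indices by using $i, j, k = 0, 1 \in \mathbb{Z}_2$ and allow multiplication $\times$ in
the ring of integers $\mathbb{Z}$. The Hadamard matrix then has matrix elements $[h]_i^j = (-1)^{i\times j}$ where $j$
is the row index and $i$ is the column index. Observe that in the diagonal basis, the permutation
matrix has elements
\[ [\hat{K}]_i^j = \delta_i^j(-1)^i. \]
Thus we have expressions such as
\[ [\hat{K}^{(\{2,3\})}]_{i_1i_2i_3}^{j_1j_2j_3} = \delta_{i_1}^{j_1}\delta_{i_2}^{j_2}\delta_{i_3}^{j_3}(-1)^{i_2+i_3}, \]
where $\hat{K}^{(\{2,3\})} = 1\otimes\hat{K}\otimes\hat{K}$.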

As we are dealing with tensors of arbitrary size, it is convenient to represent a string such as
$i_1i_2\ldots i_n$ as an ordered-bipartition $\mu = \mu_0:\mu_1$ of the set $[n]$, where $\mu_0, \mu_1 \subseteq [n]$ with $j \in \mu_k$ if and
only if $i_j = k$. For example we have the following equivalences:

00110 = {1,2,5}:{3,4}, 01111 = {1}:{2,3,4,5}, 10001 = {2,3,4}:{1,5},

and inequivalence:

01010 = {1,3,5}:{2,4} ≠ {2,4}:{1,3,5} = 10101.
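
The correspondence between state strings and ordered partitions is easy to mechanise. The sketch below is illustrative only; the function name `string_to_partition` and the parameter `d` (the number of states) are assumptions introduced here, not notation from the text.

```python
def string_to_partition(s, d=2):
    """Convert a state string such as '00110' into an ordered d-partition.

    Block k collects the (1-based) positions j whose state i_j equals k.
    """
    blocks = [set() for _ in range(d)]
    for position, state in enumerate(s, start=1):
        blocks[int(state)].add(position)
    return tuple(frozenset(block) for block in blocks)


mu = string_to_partition("00110")
print(sorted(mu[0]), sorted(mu[1]))
# Order of the blocks matters: swapping states gives a different partition.
differs = string_to_partition("01010") != string_to_partition("10101")
print(differs)
```

Passing `d=3` gives ordered tripartitions, the indexing used for three-state models.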

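We then have
\[ [\hat{K}^{(e)}]_{i_1i_2\ldots i_n}^{j_1j_2\ldots j_n} \equiv [\hat{K}^{(e)}]_{\mu_0:\mu_1}^{\nu_0:\nu_1} = \delta_{\mu_0}^{\nu_0}\delta_{\mu_1}^{\nu_1}(-1)^{|e\cap\mu_1|}. \]

Defining $h^{(n)} := h^{(n-1)}\otimes h$ where $h^{(1)} := h$, the phylogenetic tensor in the diagonal basis, $\hat{P} := h^{(n)}\cdot P$, has,
using our notation, tensor components
\[ \hat{P}_{\mu_0:\mu_1} = \sum_{\nu_0:\nu_1}[h^{(n)}]_{\mu_0:\mu_1}^{\nu_0:\nu_1}P_{\nu_0:\nu_1} = \sum_{\nu_0:\nu_1}(-1)^{|\mu_1\cap\nu_1|}P_{\nu_0:\nu_1}. \]

The zero edge-length star-tree initial distribution has tensor components
\[ [\delta^{n-1}\pi]_{i_1i_2\ldots i_n} = \tfrac{1}{2}\delta_{i_1i_2}\delta_{i_1i_3}\ldots\delta_{i_1i_n} \]
(where, although it seems we have given preference to the taxon 1 in this expression, there are
many ways that this distribution can be expressed using the $\delta_{ij}$). In the diagonal basis with
$\widehat{\delta^{n-1}\pi} := h^{(n)}\cdot\delta^{n-1}\pi$, we have components
\[ [\widehat{\delta^{n-1}\pi}]_{i_1i_2\ldots i_n} = \tfrac{1}{2}\sum_{j_1,j_2,\ldots,j_n}(-1)^{i_1\times j_1 + i_2\times j_2 + \ldots + i_n\times j_n}\delta_{j_1j_2}\delta_{j_1j_3}\ldots\delta_{j_1j_n} \]
\[ = \tfrac{1}{2}\sum_{j_1}(-1)^{(i_1+i_2+\ldots+i_n)\times j_1} = \tfrac{1}{2}\left(1 + (-1)^{i_1+i_2+\ldots+i_n}\right); \]
which is exactly the statement
\[ [\widehat{\delta^{n-1}\pi}]_{\mu_0:\mu_1} = \tfrac{1}{2}\left(1 + (-1)^{|\mu_1|}\right). \]
Since $\hat{K}$ is diagonal in the transformed basis, we can conclude that
\[ \hat{P}_{\mu_0:\mu_1} = e^{-\lambda}\exp\left(\sum_{\emptyset\neq e\subseteq[n-1]}\alpha_e[\hat{K}^{(e)}]_{\mu_0:\mu_1}^{\mu_0:\mu_1}\right)\cdot\tfrac{1}{2}\left(1 + (-1)^{|\mu_1|}\right). \]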



Of course many of these tensor components will be zero and we would like to ignore these.

Take $u = u_0:u_1$ as an ordered bipartition of the reduced set $[n-1]$, so that $u = i_1i_2\ldots i_{n-1}$
where $j \in u_k$ if and only if $i_j = k$, and define
\[ \gamma(u) = \begin{cases} 0, & \text{if } |u_1| \text{ is even}, \\ 1, & \text{if } |u_1| \text{ is odd}; \end{cases} \quad = 2 - (0|u_0| + 1|u_1|) \pmod 2, \]
and interpret $u\cdot\gamma(u)$ as a string: $u\cdot\gamma(u) = i_1i_2\ldots i_{n-1}\gamma(u)$.
If we make the definitions
\[ \mathcal{P}_u := [\hat{P}]_{u\cdot\gamma(u)}, \qquad \eta_u := \sum_{\emptyset\neq e\subseteq[n-1]}\alpha_e[\hat{K}^{(e)}]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)}, \]
then we can write the non-zero components as
\[ \mathcal{P}_u = e^{-\lambda}\exp(\eta_u), \]
with inverses
\[ \eta_u = \ln(\mathcal{P}_u) + \lambda. \qquad (4) \]
This is the first part of the inversion.

We would like to go further and actually recover the individual edge weights $\alpha_e$. To do this
we define the (square) $2^{n-1}\times 2^{n-1}$ matrix $F$ with components
\[ [F]_u^e := [\hat{K}^{(e)}]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)} = (-1)^{|e\cap u_1|} = [h^{(n-1)}]_u^e, \]
with $e$ a subset and $u$ an ordered-bipartition of $[n-1]$ (in the last expression the subset $e$ is identified
with the ordered-bipartition whose second block is $e$). As $(h^{(n-1)})^2 = 2^{n-1}1$, we see that $F$
provides its own inverse, up to a scale factor, with $F^{-1}$ having components
\[ [F^{-1}]_e^u = 2^{-(n-1)}(-1)^{|e\cap u_1|}. \]
Defining the column vectors $\vec{\alpha} = \{\alpha_e\}$ and $\vec{\eta} = \{\eta_u\}$, we can write the matrix equations
\[ \vec{\eta} = F\vec{\alpha}, \qquad \vec{\alpha} = F^{-1}\vec{\eta}. \]
Together with the first part of the inversion (4), these equations give a one-one map between
pattern probabilities and edge weights for the binary-symmetric model.
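
The two-part inversion for the binary-symmetric model can be sketched as a round trip from edge weights to diagonal-basis components and back. This is a minimal illustration under stated assumptions: an ordered bipartition of the reduced set is stored as its second block, the helper names (`subsets`, `forward`, `inverse`) are hypothetical, and the total rate is recovered from the data using the observation (not stated in the text, but following from the sign sums vanishing for nonempty subsets) that the quantities $\eta_u$ sum to zero.

```python
import itertools
import math


def subsets(m):
    """All subsets of {1, ..., m} as frozensets, empty set first."""
    return [frozenset(c) for k in range(m + 1)
            for c in itertools.combinations(range(1, m + 1), k)]


def forward(alpha, m):
    """Edge weights alpha_e -> components P_u = exp(-lam) * exp(eta_u).

    An ordered bipartition u = u_0:u_1 of [m] is stored as the block u_1.
    """
    lam = sum(alpha.values())
    return {u1: math.exp(sum(a * (-1) ** len(e & u1)
                             for e, a in alpha.items()) - lam)
            for u1 in subsets(m)}


def inverse(P, m):
    """Components P_u -> edge weights, in two steps."""
    logs = {u1: math.log(p) for u1, p in P.items()}
    # Assumption of this sketch: lam is read off using sum_u eta_u = 0,
    # which holds because no weight is attached to the empty subset.
    lam = -sum(logs.values()) / 2 ** m
    eta = {u1: value + lam for u1, value in logs.items()}      # first part
    return {e: sum((-1) ** len(e & u1) * eta[u1] for u1 in eta) / 2 ** m
            for e in subsets(m) if e}                           # second part


m = 3  # n = 4 taxa, rooted at taxon 4
alpha = {frozenset({1}): 0.10, frozenset({2}): 0.20,
         frozenset({3}): 0.05, frozenset({1, 2}): 0.15}
recovered = inverse(forward(alpha, m), m)
for e in sorted(recovered, key=sorted):
    print(sorted(e), round(recovered[e], 6))
```

Subsets that were given no weight come back as (numerically) zero, which illustrates that the weights need not be compatible with any single tree.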



4 Inversion of the Z3 model 

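Taking confidence from the previous case we now discuss the inversion of the group-based
phylogenetic model with $G = \mathbb{Z}_3$. We take $\mathbb{Z}_3 = \{0,1,2\}_{+(\mathrm{mod}\ 3)} \cong \langle\sigma\,|\,\sigma^3 = e\rangle$ and, by analogy to
the $\mathbb{Z}_2$ case, index tensors with indices $i, j = 0, 1, 2$ and allow multiplication $\times$ by extending $\mathbb{Z}_3$
to the ring $\mathbb{F}_3 = \{0,1,2\}_{+,\times(\mathrm{mod}\ 3)}$.

In this case a generic rate matrix is given by
\[ Q = \begin{pmatrix} -(\alpha+\beta) & \beta & \alpha \\ \alpha & -(\alpha+\beta) & \beta \\ \beta & \alpha & -(\alpha+\beta) \end{pmatrix} = -(\alpha+\beta)1 + \alpha K_1 + \beta K_2, \]
where
\[ K_1 = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad K_2 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \]
are the matrices representing the permutations $\sigma = (123)$ and $\sigma^2 = (132)$ under the regular
representation, respectively.

We define $\omega = e^{2\pi i/3}$; the character table of $\mathbb{Z}_3$ is given in Table 2. The decomposition
of the regular representation is $\rho_{reg} = id\oplus\omega\oplus\omega^2$, and the columns of the character
table give the projection operators onto the (one-dimensional) irreducible subspaces:
\[ \Theta_{id} := \tfrac{1}{3}(e + \sigma + \sigma^2), \]
\[ \Theta_{\omega} := \tfrac{1}{3}(e + \omega\sigma + \omega^2\sigma^2), \]
\[ \Theta_{\omega^2} := \tfrac{1}{3}(e + \omega^2\sigma + \omega\sigma^2). \]

Table 2: The character table of Z3.

           id    ω     ω^2
  [e]       1    1     1
  [σ]       1    ω     ω^2
  [σ^2]     1    ω^2   ω

Therefore, the matrix
\[ f = \begin{pmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ 1 & \omega^2 & \omega \end{pmatrix} \]
diagonalizes the generic rate matrix for this model:
\[ \hat{Q} = fQf^{-1} = \begin{pmatrix} 0 & & \\ & -(\alpha+\beta) + \alpha\omega + \beta\omega^2 & \\ & & -(\alpha+\beta) + \alpha\omega^2 + \beta\omega \end{pmatrix}, \]
or, equivalently,
\[ \hat{K}_1 = fK_1f^{-1} = \begin{pmatrix} 1 & & \\ & \omega & \\ & & \omega^2 \end{pmatrix}, \qquad \hat{K}_2 = fK_2f^{-1} = \begin{pmatrix} 1 & & \\ & \omega^2 & \\ & & \omega \end{pmatrix}. \]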
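
The claim that the character table diagonalizes the permutation matrices of the three-element cyclic group can be verified numerically. The following sketch assumes the matrix with entries $\omega^{i \times j}$ and its inverse with entries $\omega^{-i \times j}/3$; the variable names are chosen for this example only.

```python
import cmath

OMEGA = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity

# Character table f with entries omega^(i*j), and its inverse.
f = [[OMEGA ** (i * j) for j in range(3)] for i in range(3)]
f_inv = [[OMEGA ** (-i * j) / 3 for j in range(3)] for i in range(3)]
K1 = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]  # regular representation of sigma


def matmul(A, B):
    """Plain-Python matrix product; works for complex entries."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


K1_hat = matmul(matmul(f, K1), f_inv)
expected_diagonal = [1, OMEGA, OMEGA ** 2]
# Largest deviation of f K1 f^{-1} from diag(1, omega, omega^2).
max_error = max(abs(K1_hat[i][j] - (expected_diagonal[i] if i == j else 0))
                for i in range(3) for j in range(3))
print(max_error < 1e-12)
```

Because the second permutation matrix is the square of the first, its diagonal form is obtained by squaring the entries found here.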



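We recall our basic result (3) that for group-based models, a generic phylogenetic tensor can
be expressed as
\[ P = e^{-\lambda}\exp\left(\sum_{\emptyset\neq e\subseteq[n-1]}\alpha_eK_1^{(e)} + \beta_eK_2^{(e)}\right)\cdot\delta^{n-1}\pi, \]
where $\lambda = \sum_{\emptyset\neq e\subseteq[n-1]}(\alpha_e + \beta_e)$. We take the stationary distribution as initial distribution, so
$\pi = (\tfrac{1}{3},\tfrac{1}{3},\tfrac{1}{3})^T$.

The matrix elements of $f$ can be expressed as $[f]_i^j = \omega^{i\times j}$, where we extend $i, j \in \mathbb{Z}_3$ to
include multiplication $\times$ from the ring of integers $\mathbb{Z}$. Similarly,
\[ [\hat{K}_1]_i^j = \delta_i^j\omega^i, \qquad [\hat{K}_2]_i^j = \delta_i^j(\omega^2)^i. \]
More generally, tensorial components can be expressed as
\[ [1\otimes\hat{K}_1\otimes\hat{K}_1]_{i_1i_2i_3}^{j_1j_2j_3} = \delta_{i_1}^{j_1}\delta_{i_2}^{j_2}\delta_{i_3}^{j_3}\,\omega^{i_2+i_3}. \]

We represent a string $i_1i_2\ldots i_n$ as an ordered-tripartition, $i_1i_2\ldots i_n \equiv \mu = \mu_0:\mu_1:\mu_2$, of the
set $[n]$, where $j \in \mu_k$ if and only if $i_j = k$. For example, if we take $n = 5$, we have

00000 = {1,2,3,4,5}:∅:∅,
00120 = {1,2,5}:{3}:{4},
01122 = {1}:{2,3}:{4,5}.

Taking $n = 3$, we have
\[ [\hat{K}_1^{(\{2,3\})}]_\mu^\nu = [1\otimes\hat{K}_1\otimes\hat{K}_1]_\mu^\nu = \delta_\mu^\nu\,\omega^{|\mu_1\cap\{2,3\}| + 2|\mu_2\cap\{2,3\}|}, \]
and in general:
\[ [\hat{K}_1^{(e)}]_\mu^\nu = \delta_\mu^\nu\,\omega^{|e\cap\mu_1| + 2|e\cap\mu_2|}. \]

Taking the uniform distribution as initial distribution, the initial star-tree distribution can be
written as
\[ [\delta^{n-1}\pi]_{i_1i_2\ldots i_n} = \tfrac{1}{3}\delta_{i_1i_2}\delta_{i_1i_3}\ldots\delta_{i_1i_n}. \]
Defining $f^{(n)} = f^{(n-1)}\otimes f$ where $f^{(1)} = f$, we have
\[ [f^{(n)}]_{i_1i_2\ldots i_n}^{j_1j_2\ldots j_n} = \omega^{i_1\times j_1 + i_2\times j_2 + \ldots + i_n\times j_n}, \]
and in the transformed basis, where $\widehat{\delta^{n-1}\pi} := f^{(n)}\cdot\delta^{n-1}\pi$, we have
\[ [\widehat{\delta^{n-1}\pi}]_{i_1i_2\ldots i_n} = \tfrac{1}{3}\sum_{j_1,j_2,\ldots,j_n}\omega^{i_1\times j_1 + i_2\times j_2 + \ldots + i_n\times j_n}\delta_{j_1j_2}\delta_{j_1j_3}\ldots\delta_{j_1j_n} \]
\[ = \tfrac{1}{3}\sum_{j_1}\omega^{j_1\times(i_1+i_2+\ldots+i_n)} = \tfrac{1}{3}\left(1 + \omega^{i_1+i_2+\ldots+i_n} + (\omega^2)^{i_1+i_2+\ldots+i_n}\right). \]
Indexing by ordered-tripartitions, we conclude that
\[ [\widehat{\delta^{n-1}\pi}]_\mu = \tfrac{1}{3}\left(1 + \omega^{|\mu_1|+2|\mu_2|} + (\omega^2)^{|\mu_1|+2|\mu_2|}\right). \]
Now suppose $|\mu_1| + 2|\mu_2| = 0 \pmod 3$, then
\[ [\widehat{\delta^{n-1}\pi}]_\mu = \tfrac{1}{3}(1 + 1 + 1) = 1. \]
If $|\mu_1| + 2|\mu_2| = 1 \pmod 3$, then
\[ [\widehat{\delta^{n-1}\pi}]_\mu = \tfrac{1}{3}(1 + \omega + \omega^2) = 0, \]
and if $|\mu_1| + 2|\mu_2| = 2 \pmod 3$, then
\[ [\widehat{\delta^{n-1}\pi}]_\mu = \tfrac{1}{3}(1 + \omega^2 + \omega) = 0. \]
Thus we have found a basis where all the elements of the initial star-tree tensor are zero unless
the tripartition $\mu$ satisfies $|\mu_1| + 2|\mu_2| = 0 \pmod 3$. Crucially, this statement also holds for the
phylogenetic tensor $P$ because in this basis the rate matrices of this model are diagonal:
\[ \hat{P}_\mu = e^{-\lambda}\exp\left(\sum_{\emptyset\neq e\subseteq[n-1]}\alpha_e[\hat{K}_1^{(e)}]_\mu^\mu + \beta_e[\hat{K}_2^{(e)}]_\mu^\mu\right)\cdot\tfrac{1}{3}\left(1 + \omega^{|\mu_1|+2|\mu_2|} + \omega^{2(|\mu_1|+2|\mu_2|)}\right). \]

We deal with this condition on $\mu$ by taking $u = u_0:u_1:u_2$ as an ordered-tripartition of the
reduced set $[n-1]$ and setting $\mu = u\cdot\gamma(u)$ (considered as the concatenation of strings) where
\[ \gamma(u) = \begin{cases} 0, & \text{if } |u_1| + 2|u_2| = 0 \pmod 3, \\ 1, & \text{if } |u_1| + 2|u_2| = 2 \pmod 3, \\ 2, & \text{if } |u_1| + 2|u_2| = 1 \pmod 3; \end{cases} \quad = 3 - (0|u_0| + |u_1| + 2|u_2|) \pmod 3. \]
If we make the definitions
\[ \mathcal{P}_u := [\hat{P}]_{u\cdot\gamma(u)}, \qquad \eta_u := \sum_{\emptyset\neq e\subseteq[n-1]}\alpha_e[\hat{K}_1^{(e)}]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)} + \beta_e[\hat{K}_2^{(e)}]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)}, \]
we then have the first part of the inversion
\[ \mathcal{P}_u = e^{-\lambda}\exp(\eta_u), \qquad \eta_u = \ln(\mathcal{P}_u) + \lambda. \qquad (5) \]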
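
The role of the completion $\gamma(u)$ in the three-state case can be checked by brute force: for each string on the reduced set, exactly one choice of final state gives a non-vanishing star-tree component in the transformed basis. The sketch below assumes four taxa and uses hypothetical function names (`star_tree_component`, `gamma`) introduced for illustration.

```python
import cmath
import itertools

OMEGA = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity


def star_tree_component(states):
    """Transformed star-tree component (1 + w^s + w^(2s)) / 3, s = sum of states.

    The sum of the states equals |mu_1| + 2|mu_2| for the tripartition mu.
    """
    total = sum(states)
    return (1 + OMEGA ** total + OMEGA ** (2 * total)) / 3


def gamma(u):
    """Final state that makes the completed string sum to 0 (mod 3)."""
    return (-sum(u)) % 3


n = 4
consistent = True
for u in itertools.product(range(3), repeat=n - 1):
    for last in range(3):
        value = star_tree_component(u + (last,))
        target = 1.0 if last == gamma(u) else 0.0
        consistent = consistent and abs(value - target) < 1e-12
print(consistent)
```

This reduces the number of components that need to be tracked from $3^n$ to $3^{n-1}$, one for each ordered-tripartition of the reduced set.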



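As in the $\mathbb{Z}_2$ case, we would like to use $\eta_u$ to recover the rate parameters $\alpha_e, \beta_e$ for all
$\emptyset \neq e \subseteq [n-1]$ and thus complete the full inversion for this model. Of course, it is a little bit
more difficult this time.

Recall that $\mu = \mu_0:\mu_1:\mu_2$ with $\mu_i \subseteq [n]$, whereas $u = u_0:u_1:u_2$ with $u_i \subseteq [n-1]$, and
$\emptyset \neq e \subseteq [n-1]$. Considering
\[ [\hat{K}_1^{(e)}]_\mu^\mu = \omega^{|e\cap\mu_1| + 2|e\cap\mu_2|}, \]
it follows that
\[ [\hat{K}_1^{(e)}]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)} = \omega^{|e\cap u_1| + 2|e\cap u_2|}, \]
and similarly
\[ [\hat{K}_2^{(e)}]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)} = \omega^{|e\cap u_2| + 2|e\cap u_1|}. \]
Writing $\bar{e} := [n-1]\setminus e$, we make the observation that
\[ [f^{(n-1)}]_u^{\bar{e}:e:\emptyset} = \omega^{|u_1\cap e| + 2|u_2\cap e|} = [\hat{K}_1^{(e)}]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)} =: [F_1]_u^e, \]
and
\[ [f^{(n-1)}]_u^{\bar{e}:\emptyset:e} = \omega^{|u_2\cap e| + 2|u_1\cap e|} = [\hat{K}_2^{(e)}]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)} =: [F_2]_u^e, \]
where $F_1$ and $F_2$ are $2^{n-1}\times 3^{n-1}$ matrices.
Thus we may write
\[ \eta_u = \sum_{\emptyset\neq e\subseteq[n-1]}\alpha_e[F_1]_u^e + \beta_e[F_2]_u^e. \]
Defining the column vectors $\vec{\alpha} = \{\alpha_e\}$, $\vec{\beta} = \{\beta_e\}$ and $\vec{\eta} = \{\eta_u\}$, we can write
\[ \vec{\eta} = F_1\vec{\alpha} + F_2\vec{\beta}, \]
and define two $3^{n-1}\times 2^{n-1}$ matrices $G_1$ and $G_2$ as
\[ [G_1]_e^u := [f^{-1(n-1)}]_{\bar{e}:e:\emptyset}^u, \qquad [G_2]_e^u := [f^{-1(n-1)}]_{\bar{e}:\emptyset:e}^u, \]
where $f^{-1(n)} := f^{-1(n-1)}\otimes f^{-1}$ with $f^{-1(1)} := f^{-1}$.

Considering that
\[ \sum_v [f^{-1(n-1)}]_u^v[f^{(n-1)}]_v^w = \delta_u^w, \]
for all ordered-tripartitions $u, w$ of $[n-1]$, we have the matrix products
\[ G_1F_1 = 1, \quad G_1F_2 = 0, \]
\[ G_2F_2 = 1, \quad G_2F_1 = 0. \]
Thus the second part of the inversion for this model is
\[ \vec{\alpha} = G_1\vec{\eta}, \qquad \vec{\beta} = G_2\vec{\eta}. \]
Together with (5), these equations give a one-one map between pattern probabilities and edge
weights for the group-based model with $G = \mathbb{Z}_3$.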
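
The second part of the inversion for the three-state cyclic model can be sketched numerically as follows. The example is a minimal illustration under stated assumptions: three taxa (so the reduced set has two elements), arbitrary made-up rate values, only nonempty edge subsets (as in the sums defining $\eta_u$), and entries of the rectangular matrices computed on the fly rather than stored; all names are hypothetical.

```python
import cmath
import itertools

OMEGA = cmath.exp(2j * cmath.pi / 3)
m = 2  # size of the reduced set [n-1], i.e. n = 3 taxa

# Ordered tripartitions u of [m], stored as strings of states in {0, 1, 2}.
strings = list(itertools.product(range(3), repeat=m))
# Nonempty edge subsets e of [m].
edges = [frozenset(c) for k in range(1, m + 1)
         for c in itertools.combinations(range(1, m + 1), k)]


def phase(e, u):
    """Exponent |e ∩ u_1| + 2|e ∩ u_2| for the state string u."""
    return sum(u[i - 1] for i in e)


# Illustrative (made-up) rate parameters.
alpha = {e: 0.1 * (k + 1) for k, e in enumerate(edges)}
beta = {e: 0.05 * (k + 2) for k, e in enumerate(edges)}

# eta = F_1 alpha + F_2 beta, with entries w^phase and w^(2*phase).
eta = {u: sum(alpha[e] * OMEGA ** phase(e, u)
              + beta[e] * OMEGA ** (2 * phase(e, u)) for e in edges)
       for u in strings}

# alpha = G_1 eta and beta = G_2 eta, using rows of the inverse transform.
scale = 3 ** m
alpha_rec = {e: sum(OMEGA ** (-phase(e, u)) * eta[u] for u in strings) / scale
             for e in edges}
beta_rec = {e: sum(OMEGA ** (-2 * phase(e, u)) * eta[u] for u in strings) / scale
            for e in edges}

max_error = max(max(abs(alpha_rec[e] - alpha[e]), abs(beta_rec[e] - beta[e]))
                for e in edges)
print(max_error < 1e-12)
```

The recovered values are complex numbers with negligible imaginary parts, since the transform itself is complex even though the rates are real.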



5 Inversion of the K3ST model 



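We now consider the K3ST model (Kimura, 1981), which occurs as the group-based model with
$G = \mathbb{Z}_2\times\mathbb{Z}_2 = \{(0,0),(0,1),(1,0),(1,1)\}_{+(\mathrm{mod}\ 2)} \cong \langle(12)(34),(13)(24)\rangle$. In this model a generic
rate matrix is given by
\[ Q = -(\alpha+\beta+\gamma)1 + \alpha K_{01} + \beta K_{10} + \gamma K_{11}, \]
where
\[ K_{01} = 1\otimes K = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad K_{10} = K\otimes 1 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad K_{11} = K\otimes K = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}. \qquad (6) \]

We already know that the $2\times 2$ Hadamard matrix $h$ diagonalizes $K$, so we see immediately that
$H = h\otimes h$ diagonalizes this model:
\[ \hat{K}_{01} := HK_{01}H^{-1} = h\otimes h\cdot 1\otimes K\cdot h^{-1}\otimes h^{-1} = 1\otimes hKh^{-1} = \mathrm{diag}(1,-1,1,-1), \]
\[ \hat{K}_{10} := HK_{10}H^{-1} = \mathrm{diag}(1,1,-1,-1), \]
\[ \hat{K}_{11} := HK_{11}H^{-1} = \mathrm{diag}(1,-1,-1,1). \]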



Of course $H$ is the character table of $\mathbb{Z}_2\times\mathbb{Z}_2$ and the permutation matrices (6), together
with $K_{00} := 1$, give the regular representation $\rho_{reg} = id\otimes id\oplus id\otimes sgn\oplus sgn\otimes id\oplus sgn\otimes sgn$,
where we recall the basic result that the tensor product of two irreducible representations of a
group $G$ gives an irreducible representation of $G\times G$.

Simplifying notation, for this model we index tensors with indices given as pairs: $i, j =
00, 01, 10, 11 \in \mathbb{Z}_2\times\mathbb{Z}_2$; and we express the individual parts using lower case Roman characters.
For example, we write $i := ab = 01$, with $a = 0$ and $b = 1$. This gives matrix elements:
\[ [\hat{K}_{01}]_{ab}^{cd} = \delta_{ac}\delta_{bd}(-1)^b, \]
\[ [\hat{K}_{10}]_{ab}^{cd} = \delta_{ac}\delta_{bd}(-1)^a, \]
\[ [\hat{K}_{11}]_{ab}^{cd} = \delta_{ac}\delta_{bd}(-1)^{a+b}, \]
and more complicated tensor products such as
\[ [\hat{K}_{01}\otimes\hat{K}_{01}\otimes 1]_{a_1b_1a_2b_2a_3b_3}^{c_1d_1c_2d_2c_3d_3} = \delta_{a_1c_1}\delta_{b_1d_1}\delta_{a_2c_2}\delta_{b_2d_2}\delta_{a_3c_3}\delta_{b_3d_3}(-1)^{b_1+b_2}. \]

Again we interpret strings such as fj, = a\a,2 ■ ■ .a n and v = b\b2 . ■ ■ b n as ordered-bipartitions 
u = uq:ui and v = v^,:v\ of the set [n\. We can then write matrix elements of tensor products as 



ft , V 

fj. . V 
fj. : f' 

//,/' 



\ \ef\vt\ 



— i) ,fi ,(^\)\ e n^i\ 

— ,f\ ,(— 1 ^|en/ii| + |eni/i| 



Taking the stationary distribution tt = 4(1,1,1,1) as initial distribution, the zero edge- 
length star-tree distribution is given by 



L J li l 

which in the finer index representation is 

\P ^1 a 1 b 1 a 2 bo...a n b n = 4^ a i a 2 a 3 ■ • • ^<"i a„ <^i>i&2 $bi b 3 • ■ ■ ^bib n ■ 

Recall that elements of the Hadamard matrix can be written as [h]% = (— l) axb , where 
a, b G Z2 and we allow multiplication x by extending to the ring of integers Z. In the transformed 
basis, we have 



a± b\a2 b2 ■ ■ -Q-nbn 



1 T^llcZ'-'-'cn IC 2 2 ■ ■ ■ [Kz [ h \d\ \ h \% ■ ■ ■ {htdy^Sa,^ ■ ■ ■ s aia j blb2 s blbs . . . s blbn 

1 d ^_]_^(oi+02+— a n )xci+(6i+6 2 +— +6„)Xrfi 

i (l + (_]^ai+a 2 + ...+a„ _|_ ^_^b 1 +b 2 + . . .+b n _|_ ^_-y^a,i+a2+—+a n +bi+b 2 +...+b n \ 

0, if either or \v±\ is odd; 

1, if I and \v±\ are both even. 
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We recall ([3]), so under this model we can express a generic phylogenetic tensor as 



P = e- x cxp Y, a * K oi + PMo + JeKif ■ 5 n - x -K. 

\0#eC[n-l] / 

To exclude the vanishing components we define, for all ordered-bipartitions $u = u_0{:}u_1$ of the reduced set $[n-1]$,

\[
\gamma(u) := \begin{cases}0, & \text{if } |u_1| \text{ is even,}\\ 1, & \text{if } |u_1| \text{ is odd;}\end{cases}
\;=\; 2-(0|u_0|+1|u_1|) \pmod 2,
\]

and interpret $u\cdot\gamma(u)$ as the string $u\cdot\gamma(u) = a_1a_2\ldots a_{n-1}\gamma(u)$. Then, for each pair $u,v$ of ordered-bipartitions of $[n-1]$, we define



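\[
\eta_{u,v} := \Big[\sum_{\emptyset\neq e\subseteq[n-1]}\alpha_e\widehat{K}_{01}^{(e)}+\beta_e\widehat{K}_{10}^{(e)}+\gamma_e\widehat{K}_{11}^{(e)}\Big]_{u\cdot\gamma(u),\,v\cdot\gamma(v)}^{u\cdot\gamma(u),\,v\cdot\gamma(v)}
\]

and

\[
\mathcal{P}_{u,v} := [\widehat{P}]_{u\cdot\gamma(u),\,v\cdot\gamma(v)}.
\]

This gives the inversion

\[
\begin{aligned}
\mathcal{P}_{u,v} &= e^{-\lambda}\exp\left(\eta_{u,v}\right),\\
\eta_{u,v} &= \lambda+\ln\left(\mathcal{P}_{u,v}\right).
\end{aligned}
\]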

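Consider the $2^{2(n-1)}\times 2^{n-1}$ rectangular matrices $F_{01}$, $F_{10}$ and $F_{11}$ with components

\[
\begin{aligned}
[F_{01}]_{u,v}^{e} &:= [\widehat{K}_{01}^{(e)}]_{u\cdot\gamma(u),\,v\cdot\gamma(v)}^{u\cdot\gamma(u),\,v\cdot\gamma(v)} = (-1)^{|e\cap v_1|},\\
[F_{10}]_{u,v}^{e} &:= [\widehat{K}_{10}^{(e)}]_{u\cdot\gamma(u),\,v\cdot\gamma(v)}^{u\cdot\gamma(u),\,v\cdot\gamma(v)} = (-1)^{|e\cap u_1|},\\
[F_{11}]_{u,v}^{e} &:= [\widehat{K}_{11}^{(e)}]_{u\cdot\gamma(u),\,v\cdot\gamma(v)}^{u\cdot\gamma(u),\,v\cdot\gamma(v)} = (-1)^{|e\cap u_1|+|e\cap v_1|},
\end{aligned}
\]

where $e \subseteq [n-1]$ and $u = u_0{:}u_1$ and $v = v_0{:}v_1$ are ordered-bipartitions of $[n-1]$. If we define the column vector $\eta := \{\eta_{u,v}\}$ indexed by pairs of ordered-bipartitions and the column vectors $\alpha := \{\alpha_e\}$, $\beta := \{\beta_e\}$ and $\gamma := \{\gamma_e\}$ indexed by subsets of $[n-1]$, we then have the matrix equation

\[
\eta = F_{01}\alpha+F_{10}\beta+F_{11}\gamma.
\]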
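Writing $H^{(n)} = H^{(n-1)}\otimes H$ with $H^{(1)} = H$, we note that

\[
[F_{01}]_{u,v}^{e} = [H^{(n-1)}]_{u,v}^{\emptyset,e},\qquad
[F_{10}]_{u,v}^{e} = [H^{(n-1)}]_{u,v}^{e,\emptyset},\qquad
[F_{11}]_{u,v}^{e} = [H^{(n-1)}]_{u,v}^{e,e},
\]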
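and define the $2^{n-1}\times 2^{2(n-1)}$ rectangular matrices $G_{01}$, $G_{10}$ and $G_{11}$ as

\[
[G_{01}]_{e}^{u,v} := [H^{-1(n-1)}]_{\emptyset,e}^{u,v},\qquad
[G_{10}]_{e}^{u,v} := [H^{-1(n-1)}]_{e,\emptyset}^{u,v},\qquad
[G_{11}]_{e}^{u,v} := [H^{-1(n-1)}]_{e,e}^{u,v}.
\]

Noting that

\[
\sum_{w,x}[H^{-1(n-1)}]_{u,v}^{w,x}\,[H^{(n-1)}]_{w,x}^{y,z} = \delta_{uy}\delta_{vz},
\]

for all $u, v, y, z$ ordered-bipartitions of $[n-1]$, we then have the matrix identities

\[
G_{01}\cdot F_{01} = 1,\qquad G_{10}\cdot F_{10} = 1,\qquad G_{11}\cdot F_{11} = 1,
\]

and

\[
G_{01}\cdot F_{10} = 0 = G_{01}\cdot F_{11} = G_{10}\cdot F_{11} = G_{10}\cdot F_{01} = G_{11}\cdot F_{10} = G_{11}\cdot F_{01}.
\]

Writing

\[
\alpha = G_{01}\eta,\qquad \beta = G_{10}\eta,\qquad \gamma = G_{11}\eta,
\]

completes the inversion for the K3ST model.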
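
The matrix identities above lend themselves to a small numerical check. The Python sketch below is an illustration under stated assumptions rather than part of the derivation: it takes $n-1 = 2$, encodes each non-empty subset $e$ and each block $u_1$, $v_1$ as a bitmask, uses $H^{-1(n-1)} = 4^{-(n-1)}H^{(n-1)}$, and invents parameter values purely for the test. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product

m = 2                                            # m = n - 1 leaves carry edge parameters
subsets = range(1, 2 ** m)                       # non-empty e, encoded as bitmasks
pairs = list(product(range(2 ** m), repeat=2))   # (u, v): bitmasks of the blocks u_1, v_1

def sign(mask):
    """(-1)^|mask|, where |mask| counts the set bits."""
    return -1 if bin(mask).count("1") % 2 else 1

# Components [F_01]^e_{u,v}, [F_10]^e_{u,v} and [F_11]^e_{u,v}
F = {
    "01": lambda u, v, e: sign(e & v),
    "10": lambda u, v, e: sign(e & u),
    "11": lambda u, v, e: sign(e & u) * sign(e & v),
}

def G(kind, e, u, v):
    """Components of G: the matching entries of the inverse Hadamard matrix."""
    return Fraction(F[kind](u, v, e), 4 ** m)

# Hypothetical edge parameters, one per non-empty subset e
params = {
    "01": {e: Fraction(e, 10) for e in subsets},
    "10": {e: Fraction(e, 20) for e in subsets},
    "11": {e: Fraction(e, 40) for e in subsets},
}

# eta = F_01 alpha + F_10 beta + F_11 gamma
eta = {
    (u, v): sum(F[k](u, v, e) * params[k][e] for k in F for e in subsets)
    for u, v in pairs
}

# alpha = G_01 eta, beta = G_10 eta, gamma = G_11 eta
recovered = {
    k: {e: sum(G(k, e, u, v) * eta[u, v] for u, v in pairs) for e in subsets}
    for k in F
}
print(recovered == params)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the recovered parameters can be compared with the originals without a tolerance.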

6 Inversion of the $\mathbb{Z}_r$ model

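We now consider the group-based model for $\mathbb{Z}_r = \{0, 1, 2, \ldots, (r-1)\}_{+(\mathrm{mod}\ r)} \cong \langle\sigma : \sigma^r = e\rangle$. For this model the generic rate matrix has the form

\[
Q = -\lambda 1+\sum_{i=1}^{r-1}\alpha_i K_{\sigma^i},
\]

where $\lambda = \sum_{i=1}^{r-1}\alpha_i$ and

\[
K_{\sigma} = \begin{pmatrix}
0 & 0 & \cdots & 0 & 1\\
1 & 0 & \cdots & 0 & 0\\
0 & 1 & \cdots & 0 & 0\\
\vdots & & \ddots & & \vdots\\
0 & 0 & \cdots & 1 & 0
\end{pmatrix},
\]

so that $K_{\sigma^i} = K_{\sigma}^{i}$.

Defining $\omega = e^{2\pi i/r}$, we have $\omega^r = 1$ and $1+\omega+\omega^2+\ldots+\omega^{r-1} = 0$. We set $[f]_i^j := \omega^{ij}$ where $i, j = 0, 1, 2, \ldots, r-1$. Of course, $f$ is the character table of $\mathbb{Z}_r$ and $[f^{-1}]_i^j = \tfrac{1}{r}\omega^{-ij}$.

Lemma 6.1.

\[
\sum_{\nu}[f\otimes f\otimes\ldots\otimes f]_{\mu}^{\nu}\,[f^{-1}\otimes f^{-1}\otimes\ldots\otimes f^{-1}]_{\nu}^{\mu'} = \delta_{\mu}^{\mu'},
\]

where $\mu$, $\nu$, $\mu'$ are ordered-$r$-partitions of the set $[n]$ corresponding to the strings $i_1i_2\ldots i_n$, $j_1j_2\ldots j_n$ and $k_1k_2\ldots k_n$.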
Proof. The result is obvious by the definition of tensor product. However, explicitly we have

\[
\begin{aligned}
\sum_{\nu}[f\otimes f\otimes\ldots\otimes f]_{\mu}^{\nu}\,[f^{-1}\otimes f^{-1}\otimes\ldots\otimes f^{-1}]_{\nu}^{\mu'}
&= \frac{1}{r^n}\sum_{0\le j_1,j_2,\ldots,j_n\le(r-1)}\omega^{i_1j_1+i_2j_2+\ldots+i_nj_n}\,\omega^{-(j_1k_1+j_2k_2+\ldots+j_nk_n)}\\
&= \frac{1}{r^n}\sum_{0\le j_1,j_2,\ldots,j_n\le(r-1)}\omega^{j_1(i_1-k_1)+j_2(i_2-k_2)+\ldots+j_n(i_n-k_n)},
\end{aligned}
\]

which clearly equals 1 if $i_\ell-k_\ell = 0$ for all $\ell$, and, by repeatedly applying $1+\omega+\omega^2+\ldots+\omega^{r-1} = 0$, equals 0 otherwise. □

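The regular representation contains exactly one copy of every irreducible representation and the irreducible representations of $\mathbb{Z}_r$ are given by the powers of $\omega$:

\[
\rho_i : \mathbb{Z}_r \to \mathbb{C},\qquad \sigma \mapsto \omega^{i}.
\]

Thus the change of basis $K_{\sigma^i} \mapsto \widehat{K}_{\sigma^i} = fK_{\sigma^i}f^{-1}$ will give diagonal matrices $\widehat{K}_{\sigma^i}$. Additionally,

Lemma 6.2. In the diagonal basis, the matrices $\widehat{K}_{\sigma^s} := fK_{\sigma^s}f^{-1}$ have matrix elements $[\widehat{K}_{\sigma^s}]_i^j = \omega^{is}\delta_{ij}$.

Proof. Consider the matrix elements $[K_{\sigma^s}]_i^j = \delta_{i\sigma^s(j)}$. Thus

\[
[\widehat{K}_{\sigma^s}]_i^j = \sum_{k,m}[f]_i^k[K_{\sigma^s}]_k^m[f^{-1}]_m^j = \frac{1}{r}\sum_{m}\omega^{i\sigma^s(m)}\omega^{-mj} = \omega^{is}\,\frac{1}{r}\sum_{m}\omega^{m(i-j)} = \omega^{is}\delta_{ij},
\]

where we have used $\omega^{i\sigma^s(m)} = \omega^{i(m+s)}$. □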
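
Lemma 6.2 is easy to confirm numerically. The sketch below is illustrative only: it builds $f$, $f^{-1}$ and the permutation matrices $K_{\sigma^s}$ as nested Python lists, under the assumption that rows carry the lower index and columns the upper index, and reports the largest deviation of $fK_{\sigma^s}f^{-1}$ from $\mathrm{diag}(\omega^{is})$. The helper names are hypothetical.

```python
import cmath

def character_table(r):
    """[f]_i^j = omega^(i*j) with omega = exp(2*pi*i/r)."""
    w = cmath.exp(2j * cmath.pi / r)
    return [[w ** (i * j) for j in range(r)] for i in range(r)]

def inverse_character_table(r):
    """[f^-1]_i^j = omega^(-i*j) / r."""
    w = cmath.exp(2j * cmath.pi / r)
    return [[w ** (-i * j) / r for j in range(r)] for i in range(r)]

def shift_matrix(r, s):
    """Permutation matrix K_{sigma^s}: entry (i, j) is 1 when i = j + s (mod r)."""
    return [[1 if i == (j + s) % r else 0 for j in range(r)] for i in range(r)]

def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def max_deviation(r):
    """Largest gap between f K_{sigma^s} f^-1 and diag(omega^(i*s)) over all s."""
    w = cmath.exp(2j * cmath.pi / r)
    f, f_inv = character_table(r), inverse_character_table(r)
    worst = 0.0
    for s in range(r):
        K_hat = matmul(matmul(f, shift_matrix(r, s)), f_inv)
        for i in range(r):
            for j in range(r):
                target = w ** (i * s) if i == j else 0
                worst = max(worst, abs(K_hat[i][j] - target))
    return worst

print(max_deviation(5))
```

The printed value should be at the level of floating-point rounding error.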
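Now

\[
\begin{aligned}
[\widehat{\delta^{n-1}\cdot\pi}]_{i_1i_2\ldots i_n}
&= \frac{1}{r}\sum_{j_1,j_2,\ldots,j_n}\omega^{i_1j_1+i_2j_2+\ldots+i_nj_n}\,\delta_{j_1j_2}\delta_{j_1j_3}\cdots\delta_{j_1j_n}\\
&= \frac{1}{r}\sum_{j_1}\omega^{j_1(i_1+i_2+\ldots+i_n)}\\
&= \begin{cases}1, & \text{if } i_1+i_2+\ldots+i_n = 0 \pmod r;\\ 0, & \text{otherwise.}\end{cases}
\end{aligned}
\]

Translating this result using the ordered-$r$-partitions for indices, we have

Lemma 6.3. In the diagonal basis, the uniform initial distribution on the star tree has components

\[
[\widehat{\delta^{n-1}\cdot\pi}]_{\mu} = \begin{cases}1, & \text{if } 0|\mu_0|+1|\mu_1|+2|\mu_2|+\ldots+(r-1)|\mu_{r-1}| = 0 \pmod r;\\ 0, & \text{otherwise,}\end{cases}
\]

where $\mu = \mu_0{:}\mu_1{:}\mu_2{:}\ldots{:}\mu_{r-1}$ is an ordered-$r$-partition of the set $[n]$.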
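
To make the translation between index strings and ordered-$r$-partitions concrete, the following Python sketch (an illustration with hypothetical function names, assuming $r = 3$ and $n = 3$) converts each string $i_1i_2\ldots i_n$ into its blocks $\mu_0{:}\mu_1{:}\ldots{:}\mu_{r-1}$ and compares the transformed star-tree components with the rule of Lemma 6.3.

```python
import cmath
from itertools import product

def blocks(string, r):
    """Ordered-r-partition mu_0:mu_1:...:mu_{r-1} of [n] encoded by a string i_1...i_n."""
    return [{pos + 1 for pos, state in enumerate(string) if state == k} for k in range(r)]

def weighted_size(string, r):
    """0|mu_0| + 1|mu_1| + ... + (r-1)|mu_{r-1}|."""
    return sum(k * len(block) for k, block in enumerate(blocks(string, r)))

def star_tree_hat(r, n):
    """Components of f (x) ... (x) f applied to the uniform star tree."""
    w = cmath.exp(2j * cmath.pi / r)
    out = {}
    for string in product(range(r), repeat=n):
        # Only the r patterns jj...j have weight 1/r, so the sum runs over j alone.
        out[string] = sum(w ** (j * sum(string)) for j in range(r)) / r
    return out

r, n = 3, 3
hat = star_tree_hat(r, n)
worst = max(
    abs(value - (1 if weighted_size(string, r) % r == 0 else 0))
    for string, value in hat.items()
)
print(worst)
```

For example, the string `(0, 2, 2)` corresponds to the partition $\{1\}{:}\emptyset{:}\{2,3\}$, with weighted size 4, so its component vanishes for $r = 3$.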





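Again recall that for this model a generic phylogenetic tensor can be written as

\[
P = e^{-\lambda}\exp\Big(\sum_{\emptyset\neq e\subseteq[n-1],\,s\in[r-1]}\alpha_e^{s}K_{\sigma^s}^{(e)}\Big)\cdot\delta^{n-1}\cdot\pi,
\]

where $\pi = \tfrac{1}{r}(1, 1, \ldots, 1)^T$. In the diagonal basis $\widehat{P} := f^{(n)}\cdot P$ and, as a consequence of Lemma 6.3, $\widehat{P}$ will have many vanishing components. To avoid these we take $u = u_0{:}u_1{:}u_2{:}\ldots{:}u_{r-1}$ as an ordered-$r$-partition of $[n-1]$ and set

\[
\gamma(u) = r-(0|u_0|+1|u_1|+2|u_2|+\ldots+(r-1)|u_{r-1}|) \pmod r.
\]

If we define

\[
\mathcal{P}_u := [\widehat{P}]_{u\cdot\gamma(u)}
\]

and

\[
\eta_u := \Big[\sum_{\emptyset\neq e\subseteq[n-1],\,s\in[r-1]}\alpha_e^{s}\widehat{K}_{\sigma^s}^{(e)}\Big]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)},
\]

we then have the first part of the inversion for the $\mathbb{Z}_r$ model:

\[
\begin{aligned}
\mathcal{P}_u &= e^{-\lambda}\exp\left(\eta_u\right),\\
\eta_u &= \ln\left(\mathcal{P}_u\right)+\lambda.
\end{aligned}
\]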
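
The first part of the inversion can be illustrated end to end in the smallest non-trivial case. The Python sketch below is a check under assumptions of our own choosing, not the paper's computation: it takes $r = 3$ and $n = 2$ (so the only non-empty subset is $e = \{1\}$), uses made-up rates, builds $P$ in the original basis with a truncated Taylor series for the matrix exponential, and then compares $[\widehat{P}]_{u\cdot\gamma(u)}$ with $e^{-\lambda}\exp(\eta_u)$. All names are hypothetical.

```python
import cmath

r = 3
alpha = {1: 0.3, 2: 0.1}          # hypothetical rates alpha^s_e for the single subset e = {1}
lam = sum(alpha.values())
w = cmath.exp(2j * cmath.pi / r)

def gamma(u):
    """State completing the one-letter string u so that u.gamma(u) sums to 0 (mod r)."""
    return (r - u) % r

def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def expm(A, terms=40):
    """Truncated Taylor series for exp(A); adequate for the small rates used here."""
    size = len(A)
    result = [[float(i == j) for j in range(size)] for i in range(size)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(size)] for i in range(size)]
    return result

# Rate matrix on leaf 1: Q = -lam * 1 + sum_s alpha^s K_{sigma^s}
Q = [[sum(a * (i == (j + s) % r) for s, a in alpha.items()) - lam * (i == j)
      for j in range(r)] for i in range(r)]

# Two-leaf tensor: evolve leaf 1 away from the star tree delta . pi
M = expm(Q)
P = [[M[i1][i2] / r for i2 in range(r)] for i1 in range(r)]

# Diagonal basis: P_hat = (f (x) f) . P, which equals f P f because f is symmetric
f = [[w ** (i * j) for j in range(r)] for i in range(r)]
P_hat = matmul(matmul(f, P), f)

def eta(u):
    """Diagonal element of sum_s alpha^s K_hat_{sigma^s} at index u."""
    return sum(a * w ** (s * u) for s, a in alpha.items())

restricted = {u: P_hat[u][gamma(u)] for u in range(r)}      # the components P_u
forward_gap = max(abs(restricted[u] - cmath.exp(-lam + eta(u))) for u in range(r))
inverse_gap = max(abs(lam + cmath.log(restricted[u]) - eta(u)) for u in range(r))
print(forward_gap, inverse_gap)
```

For $r \ge 3$ the quantities $\eta_u$ are complex in general, so the logarithm here is the principal branch of the complex logarithm; that is sufficient for the small rates assumed in this sketch.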

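For each $i \in [r-1]$, we define the column vectors $\alpha_i := \{\alpha_e^{i}\}_{\emptyset\neq e\subseteq[n-1]}$ and, for each $\emptyset \neq e \subseteq [n-1]$ and $u$ an ordered-$r$-partition of $[n-1]$, we define the rectangular $r^{n-1}\times 2^{n-1}$ matrices

\[
[F_1]_u^e := [\widehat{K}_{\sigma}^{(e)}]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)},\qquad
[F_2]_u^e := [\widehat{K}_{\sigma^2}^{(e)}]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)},\qquad\ldots,\qquad
[F_{r-1}]_u^e := [\widehat{K}_{\sigma^{r-1}}^{(e)}]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)},
\]

so we have the vector equation

\[
\eta = F_1\alpha_1+F_2\alpha_2+\ldots+F_{r-1}\alpha_{r-1}.
\]

We claim that

Lemma 6.4.

\[
\begin{aligned}
[F_1]_u^e &= [f^{(n-1)}]_u^{e^c:e:\emptyset:\emptyset:\ldots:\emptyset},\\
[F_2]_u^e &= [f^{(n-1)}]_u^{e^c:\emptyset:e:\emptyset:\ldots:\emptyset},\\
&\;\;\vdots\\
[F_{r-1}]_u^e &= [f^{(n-1)}]_u^{e^c:\emptyset:\emptyset:\emptyset:\ldots:e}.
\end{aligned}
\]

Proof. We recall that $[\widehat{K}_{\sigma^s}]_i^j = \omega^{is}\delta_{ij}$, so, for $\mu = \mu_0{:}\mu_1{:}\mu_2{:}\ldots{:}\mu_{r-1}$ an ordered-$r$-partition of $[n]$, and $e$ a subset of $[n-1]$, we have

\[
[\widehat{K}_{\sigma^s}^{(e)}]_{\mu}^{\mu} = \omega^{s(0|\mu_0\cap e|+1|\mu_1\cap e|+\ldots+(r-1)|\mu_{r-1}\cap e|)},
\]

and hence

\[
[\widehat{K}_{\sigma^s}^{(e)}]_{u\cdot\gamma(u)}^{u\cdot\gamma(u)} = \omega^{s(0|u_0\cap e|+1|u_1\cap e|+\ldots+(r-1)|u_{r-1}\cap e|)},
\]

because $e \subseteq [n-1]$. On the other hand $[f]_i^j = \omega^{ij}$, so

\[
[f^{(n-1)}]_u^{e^c:\emptyset:\ldots:\emptyset:e:\emptyset:\ldots:\emptyset} = \omega^{s(0|u_0\cap e|+1|u_1\cap e|+\ldots+(r-1)|u_{r-1}\cap e|)},
\]

where $e$ appears in the $s^{th}$ position. □

Define, for $i \in [r-1]$, the rectangular $2^{n-1}\times r^{n-1}$ matrices

\[
\begin{aligned}
[G_1]_e^u &:= [f^{-1(n-1)}]_{e^c:e:\emptyset:\emptyset:\ldots:\emptyset}^{u},\\
[G_2]_e^u &:= [f^{-1(n-1)}]_{e^c:\emptyset:e:\emptyset:\ldots:\emptyset}^{u},\\
&\;\;\vdots\\
[G_{r-1}]_e^u &:= [f^{-1(n-1)}]_{e^c:\emptyset:\emptyset:\emptyset:\ldots:e}^{u}.
\end{aligned}
\]

Of course $G_i\cdot F_j = \delta_{ij}1$, so we now have the second part of the inversion:

\[
\alpha_i = G_i\cdot\eta.
\]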
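
The second part of the inversion can also be checked numerically. The Python sketch below is an illustration under assumptions of our own choosing: $r = 3$, $n-1 = 2$, subsets $e$ stored as tuples of leaf positions, strings $u$ stored as tuples of states, and invented parameter values. It uses the closed forms $[F_s]_u^e = \omega^{s\,w(u,e)}$ and $[G_s]_e^u = r^{-(n-1)}\omega^{-s\,w(u,e)}$, where $w(u,e)$ denotes the weighted sum $0|u_0\cap e|+\ldots+(r-1)|u_{r-1}\cap e|$; the symbol $w(u,e)$ and all function names are hypothetical.

```python
import cmath
from itertools import combinations, product

r, m = 3, 2                                    # m = n - 1
w = cmath.exp(2j * cmath.pi / r)
strings = list(product(range(r), repeat=m))    # u, written as a string of m states
subsets = [e for size in range(1, m + 1) for e in combinations(range(m), size)]

def weight(u, e):
    """0|u_0 n e| + 1|u_1 n e| + ... + (r-1)|u_{r-1} n e|: the sum of u's states over e."""
    return sum(u[leaf] for leaf in e)

def F(s, u, e):
    """Component [F_s]^e_u."""
    return w ** (s * weight(u, e))

def G(s, e, u):
    """Component [G_s]^u_e, taken from the inverse character table."""
    return w ** (-s * weight(u, e)) / r ** m

def GF(i, e, j, e2):
    """Component of the product G_i . F_j with row e and column e2."""
    return sum(G(i, e, u) * F(j, u, e2) for u in strings)

# Hypothetical parameters alpha^s_e
alpha = {(s, e): 0.05 * s + 0.01 * len(e) for s in range(1, r) for e in subsets}

# eta = F_1 alpha_1 + ... + F_{r-1} alpha_{r-1}, then alpha_s = G_s . eta
eta = {u: sum(F(s, u, e) * a for (s, e), a in alpha.items()) for u in strings}
recovered = {(s, e): sum(G(s, e, u) * eta[u] for u in strings) for (s, e) in alpha}
worst = max(abs(recovered[key] - alpha[key]) for key in alpha)
print(worst)
```

The helper `GF` makes the orthogonality $G_i\cdot F_j = \delta_{ij}1$ directly testable for any pair of subsets.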

7 Inversion of any abelian group-based model 

Lemma 7.1. Any finite abelian group $G$ is isomorphic to a direct product of cyclic groups of prime-power order, i.e. $G \cong \mathbb{Z}_{r_1}\times\mathbb{Z}_{r_2}\times\ldots\times\mathbb{Z}_{r_q}$ where each $r_i = p_i^{n_i}$, with $p_i$ prime and $n_i$ a positive integer.

Lemma 7.2. The group-based model arising from the group $G$ is defined only up to group isomorphisms of $G$.

Proof. A generic rate matrix for the group-based model arising from $G$ is given by

\[
Q = -\lambda 1+\sum_{e\neq\sigma\in G}\alpha_{\sigma}K_{\sigma}.
\]

Under a group isomorphism $\phi : G \to G'$, we have $\phi(\sigma_i\sigma_j) = \phi(\sigma_i)\phi(\sigma_j)$.

Recall that the matrix elements $[K_{\sigma}]_i^j$ are set via the action $\sigma_i \mapsto \sigma\sigma_i = \sigma_j$. If we consider the regular representation of $G'$ we then have $[K_{\phi(\sigma)}]_i^j$ defined by $\phi(\sigma_i) \mapsto \phi(\sigma)\phi(\sigma_i)$. Now $\phi(\sigma)\phi(\sigma_i) = \phi(\sigma\sigma_i) = \phi(\sigma_j)$ and, because $\phi$ is a group isomorphism, this occurs if and only if $\sigma\sigma_i = \sigma_j$. Thus $[K_{\phi(\sigma)}]_i^j = [K_{\sigma}]_i^j$ for all $i$ and $j$. □

This means that we can restrict attention to a single representative in the isomorphism class of $G$. Of course, for this purpose we choose the representative guaranteed by Lemma 7.1.

Thus, for any abelian group $G$, with generators $\sigma_1, \sigma_2, \ldots, \sigma_q$, as per Lemma 7.1, the corresponding group-based model has rate generators given by

\[
L_{\sigma} = -1+K_{\sigma_1^{m_1}}\otimes K_{\sigma_2^{m_2}}\otimes\ldots\otimes K_{\sigma_q^{m_q}}
\]

for all $e \neq \sigma = (\sigma_1^{m_1}, \sigma_2^{m_2}, \ldots, \sigma_q^{m_q}) \in G$, where $K_{\sigma_i}$ is the permutation matrix representing the generator $\sigma_i \in \mathbb{Z}_{r_i}$. The character table $f$ of $G$ is simply the tensor product of the individual character tables of the $\mathbb{Z}_{r_i}$:

\[
f = f_1\otimes f_2\otimes\ldots\otimes f_q.
\]
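
As a sanity check on the tensor-product form of the character table, the following Python sketch (illustrative only, with hypothetical helper names) takes the assumed example $G = \mathbb{Z}_2\times\mathbb{Z}_3$, forms $f = f_1\otimes f_2$ as a Kronecker product, and confirms that it diagonalises every $K_{\sigma_1^{m_1}}\otimes K_{\sigma_2^{m_2}}$ with diagonal entries $\omega_{2}^{i_1m_1}\omega_{3}^{i_2m_2}$, where $\omega_k = e^{2\pi i/k}$ is assumed.

```python
import cmath

def character_table(r):
    """[f]_i^j = omega^(i*j) with omega = exp(2*pi*i/r)."""
    w = cmath.exp(2j * cmath.pi / r)
    return [[w ** (i * j) for j in range(r)] for i in range(r)]

def inverse_character_table(r):
    """[f^-1]_i^j = omega^(-i*j) / r."""
    w = cmath.exp(2j * cmath.pi / r)
    return [[w ** (-i * j) / r for j in range(r)] for i in range(r)]

def shift_matrix(r, s):
    """Permutation matrix of sigma^s on Z_r: entry (i, j) is 1 when i = j + s (mod r)."""
    return [[1 if i == (j + s) % r else 0 for j in range(r)] for i in range(r)]

def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def kron(A, B):
    """Kronecker product; rows and columns are ordered as pairs (index in A, index in B)."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

r1, r2 = 2, 3
f = kron(character_table(r1), character_table(r2))
f_inv = kron(inverse_character_table(r1), inverse_character_table(r2))

def max_deviation(m1, m2):
    """Gap between f (K1^m1 (x) K2^m2) f^-1 and the predicted diagonal matrix."""
    w1, w2 = cmath.exp(2j * cmath.pi / r1), cmath.exp(2j * cmath.pi / r2)
    K_hat = matmul(matmul(f, kron(shift_matrix(r1, m1), shift_matrix(r2, m2))), f_inv)
    worst = 0.0
    for i1 in range(r1):
        for i2 in range(r2):
            i = i1 * r2 + i2                       # position of the pair (i1, i2)
            for j in range(r1 * r2):
                target = w1 ** (m1 * i1) * w2 ** (m2 * i2) if i == j else 0
                worst = max(worst, abs(K_hat[i][j] - target))
    return worst

print(max(max_deviation(m1, m2) for m1 in range(r1) for m2 in range(r2)))
```

The check relies on the mixed-product property of the Kronecker product, $(A\otimes B)(C\otimes D) = AC\otimes BD$, which is why each factor can be diagonalised separately.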

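In the diagonal basis we have matrix elements

\[
[\widehat{K}_{\sigma_s^{m}}]_i^j = \omega_{r_s}^{im}\,\delta_{ij},
\]

where $\omega_k$ is a $k^{th}$ root of unity. Thus

\[
[\widehat{K}_{\sigma_1^{m_1}}\otimes\widehat{K}_{\sigma_2^{m_2}}\otimes\ldots\otimes\widehat{K}_{\sigma_q^{m_q}}]_{i_1i_2\ldots i_q}^{j_1j_2\ldots j_q} = \omega_{r_1}^{i_1m_1}\omega_{r_2}^{i_2m_2}\cdots\omega_{r_q}^{i_qm_q}\,\delta_{i_1j_1}\delta_{i_2j_2}\cdots\delta_{i_qj_q}.
\]

We write phylogenetic tensors for this model in the form $P_{i_{11}i_{12}\ldots i_{1n},\,i_{21}i_{22}\ldots i_{2n},\,\ldots,\,i_{q1}i_{q2}\ldots i_{qn}}$, where $0 \le i_{sj} < r_s$ for all $0 < s \le q$. We simplify notation by writing each group of indices as $\mu^{(s)} := i_{s1}i_{s2}\ldots i_{sn}$ where $\mu^{(s)}$ is an ordered-$r_s$-partition of $[n]$.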

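Lemma 7.3. In the diagonal basis, the uniform initial distribution on the star tree has components

\[
[\widehat{\delta^{n-1}\cdot\pi}]_{\mu^{(1)},\mu^{(2)},\ldots,\mu^{(q)}} = \begin{cases}1, & \text{if } 0|\mu_0^{(i)}|+1|\mu_1^{(i)}|+\ldots+(r_i-1)|\mu_{r_i-1}^{(i)}| = 0 \pmod{r_i},\ \forall i;\\ 0, & \text{otherwise.}\end{cases}
\]
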

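A generic phylogenetic tensor for this model can be expressed as

\[
P = e^{-\lambda}\exp\Big(\sum_{\emptyset\neq e\subseteq[n-1],\,s_i\in[r_i-1]}\alpha_e^{s_1s_2\ldots s_q}\,K_{\sigma_1^{s_1}}^{(e)}\otimes K_{\sigma_2^{s_2}}^{(e)}\otimes\ldots\otimes K_{\sigma_q^{s_q}}^{(e)}\Big)\cdot\delta^{n-1}\cdot\pi,
\]

where $\pi$ is the uniform distribution on $r_1r_2\ldots r_q$ states, i.e. $\pi = (r_1r_2\ldots r_q)^{-1}(1, 1, \ldots, 1)^T$.

In the diagonal basis $\widehat{P} = (f_1\otimes f_2\otimes\ldots\otimes f_q)^{(n)}\cdot P$, and, as a consequence of the previous lemma, $\widehat{P}$ has many vanishing components. To avoid these, for each $i \in [q]$ we take $u^{(i)} = u_0^{(i)}{:}u_1^{(i)}{:}\ldots{:}u_{r_i-1}^{(i)}$ as an ordered-$r_i$-partition of $[n-1]$ and set

\[
\gamma_i(u^{(i)}) = r_i-\big(0|u_0^{(i)}|+1|u_1^{(i)}|+2|u_2^{(i)}|+\ldots+(r_i-1)|u_{r_i-1}^{(i)}|\big) \pmod{r_i}.
\]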
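We then define

\[
\mathcal{P}_{u^{(1)}u^{(2)}\ldots u^{(q)}} := [\widehat{P}]_{u^{(1)}\cdot\gamma_1(u^{(1)}),\,u^{(2)}\cdot\gamma_2(u^{(2)}),\,\ldots,\,u^{(q)}\cdot\gamma_q(u^{(q)})}
\]

and

\[
\eta_{u^{(1)}u^{(2)}\ldots u^{(q)}} := \Big[\sum_{\emptyset\neq e\subseteq[n-1],\,s_i\in[r_i-1]}\alpha_e^{s_1s_2\ldots s_q}\,\widehat{K}_{\sigma_1^{s_1}}^{(e)}\otimes\widehat{K}_{\sigma_2^{s_2}}^{(e)}\otimes\ldots\otimes\widehat{K}_{\sigma_q^{s_q}}^{(e)}\Big]_{u^{(1)}\cdot\gamma_1(u^{(1)}),\,u^{(2)}\cdot\gamma_2(u^{(2)}),\,\ldots,\,u^{(q)}\cdot\gamma_q(u^{(q)})}^{u^{(1)}\cdot\gamma_1(u^{(1)}),\,u^{(2)}\cdot\gamma_2(u^{(2)}),\,\ldots,\,u^{(q)}\cdot\gamma_q(u^{(q)})}
\]

so that we have the first part of the inversion

\[
\begin{aligned}
\mathcal{P}_{u^{(1)}u^{(2)}\ldots u^{(q)}} &= e^{-\lambda}\exp\left(\eta_{u^{(1)}u^{(2)}\ldots u^{(q)}}\right),\\
\eta_{u^{(1)}u^{(2)}\ldots u^{(q)}} &= \lambda+\ln\left(\mathcal{P}_{u^{(1)}u^{(2)}\ldots u^{(q)}}\right).
\end{aligned}
\]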






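We define the column vectors $\alpha_{s_1s_2\ldots s_q} := \{\alpha_e^{s_1s_2\ldots s_q}\}_{\emptyset\neq e\subseteq[n-1]}$ and $\eta := \{\eta_{u^{(1)}u^{(2)}\ldots u^{(q)}}\}$, where each $u_i := u^{(i)}$ is an ordered-$r_i$-partition of $[n-1]$, and the $(r_1r_2\ldots r_q)^{n-1}\times 2^{n-1}$ matrices

\[
\begin{aligned}
[F_{s_1s_2\ldots s_q}]_{u_1u_2\ldots u_q}^{e}
&:= [\widehat{K}_{\sigma_1^{s_1}}^{(e)}]_{u_1\cdot\gamma_1(u_1)}^{u_1\cdot\gamma_1(u_1)}\,[\widehat{K}_{\sigma_2^{s_2}}^{(e)}]_{u_2\cdot\gamma_2(u_2)}^{u_2\cdot\gamma_2(u_2)}\cdots[\widehat{K}_{\sigma_q^{s_q}}^{(e)}]_{u_q\cdot\gamma_q(u_q)}^{u_q\cdot\gamma_q(u_q)}\\
&= [f_1^{(n-1)}]_{u_1}^{e^c:\emptyset:\ldots:\emptyset:e:\emptyset:\ldots:\emptyset}\,[f_2^{(n-1)}]_{u_2}^{e^c:\emptyset:\ldots:\emptyset:e:\emptyset:\ldots:\emptyset}\cdots[f_q^{(n-1)}]_{u_q}^{e^c:\emptyset:\ldots:\emptyset:e:\emptyset:\ldots:\emptyset},
\end{aligned}
\]

where in each term $e$ appears in the $s_i^{th}$ position and the equality follows from Lemma 6.4.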
We can then write the vector equation

\[
\eta = \sum_{s_1s_2\ldots s_q\,:\,1\le s_i\le r_i-1}F_{s_1s_2\ldots s_q}\,\alpha_{s_1s_2\ldots s_q}.
\]



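If we define the $2^{n-1}\times(r_1r_2\ldots r_q)^{n-1}$ matrices

\[
[G_{s_1s_2\ldots s_q}]_{e}^{u_1u_2\ldots u_q} := [f_1^{-1(n-1)}]_{e^c:\emptyset:\ldots:\emptyset:e:\emptyset:\ldots:\emptyset}^{u_1}\,[f_2^{-1(n-1)}]_{e^c:\emptyset:\ldots:\emptyset:e:\emptyset:\ldots:\emptyset}^{u_2}\cdots[f_q^{-1(n-1)}]_{e^c:\emptyset:\ldots:\emptyset:e:\emptyset:\ldots:\emptyset}^{u_q},
\]

where in each term $e$ appears in the $s_i^{th}$ position, we have the orthogonality relations

\[
G_{s_1s_2\ldots s_q}F_{s'_1s'_2\ldots s'_q} = \delta_{s_1s'_1}\delta_{s_2s'_2}\cdots\delta_{s_qs'_q}\,1.
\]

This gives us the second part of the inversion of any group-based model:

\[
\alpha_{s_1s_2\ldots s_q} = G_{s_1s_2\ldots s_q}\cdot\eta.
\]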

8 Conclusion 

In this article we have given an alternative derivation of the inversion of group-based phylogenetic
models. Primarily our method relies on the remarkable intertwining relation between branching
events and Markov evolution, and the resulting simplified expression of phylogenetic tensors
given in (3). From there we took a representation-theoretic approach concentrating on the
structure of tensor indices.